Statistical-based System for Morphological Annotation of Arabic Texts

Authors

  • Nabil Khoufi
  • Manel Boudokhane
Abstract

In this paper, we propose a corpus-based method for annotating Arabic texts with morphological information. The method proceeds in two stages: segmentation and morphological analysis. The morphological analysis stage is based on a statistical method that uses an annotated corpus. Our system, AMAS (Arabic Morphological Annotation System), accepts an Arabic text as input and produces a text annotated with morphological information in XML format. To evaluate the method, we compared the results generated by AMAS with those produced by a human expert.
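
The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the AMAS implementation: the abstract does not specify the segmentation procedure, the statistical model, the tag set, or the XML schema. Here whitespace tokenization stands in for segmentation, a most-frequent-tag lookup stands in for the statistical analysis, and the names `TOY_CORPUS`, `segment`, `train`, `analyze`, `to_xml`, and `annotate`, as well as the `<text>`/`<word>` elements, are all hypothetical.

```python
from collections import Counter, defaultdict
import xml.etree.ElementTree as ET

# Hypothetical toy annotated corpus of (token, tag) pairs.
# The real corpus and tag set used by AMAS are not given in the abstract.
TOY_CORPUS = [
    ("كتب", "VERB"), ("كتب", "NOUN"), ("كتب", "VERB"),
    ("الولد", "NOUN"), ("في", "PREP"), ("المدرسة", "NOUN"),
]

PUNCTUATION = ".,!?:;،؛؟"


def segment(text):
    """Stage 1 (assumed): split on whitespace and strip edge punctuation."""
    tokens = (tok.strip(PUNCTUATION) for tok in text.split())
    return [tok for tok in tokens if tok]


def train(corpus):
    """Count how often each tag was assigned to each token in the corpus."""
    counts = defaultdict(Counter)
    for token, tag in corpus:
        counts[token][tag] += 1
    return counts


def analyze(tokens, counts, unknown="UNK"):
    """Stage 2 (assumed): pick the most frequent corpus tag for each token."""
    tagged = []
    for tok in tokens:
        if tok in counts:
            tag = counts[tok].most_common(1)[0][0]
        else:
            tag = unknown  # token never seen in the annotated corpus
        tagged.append((tok, tag))
    return tagged


def to_xml(tagged):
    """Serialize the tagged tokens as XML (element names are made up)."""
    root = ET.Element("text")
    for tok, tag in tagged:
        word = ET.SubElement(root, "word", {"tag": tag})
        word.text = tok
    return ET.tostring(root, encoding="unicode")


def annotate(text, corpus=TOY_CORPUS):
    """Run segmentation, then statistical tagging, then XML output."""
    return to_xml(analyze(segment(text), train(corpus)))


if __name__ == "__main__":
    print(annotate("كتب الولد في المدرسة."))
```

A frequency lookup ignores context, so an ambiguous form always receives the same tag; a context-aware statistical model would be the natural refinement, but nothing in the abstract indicates which model the authors chose.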

Similar resources

Statistical Part-of-Speech Tagger for Traditional Arabic Texts

Problem statement: This study presented the development of an Arabic part-of-speech tagger that can be used for analyzing and annotating traditional Arabic texts, especially the Quran text. Approach: It is a part of a project related to the computerization of the Holy Quran. One of the main objectives in this project was to build a textual corpus of the Holy Quran. Results: Since an appropriate...

SHAKKIL: An Automatic Diacritization System for Modern Standard Arabic Texts

This paper sheds light on a system that would be able to diacritize Arabic texts automatically (SHAKKIL). In this system, the diacritization problem will be handled at two levels: morphological and syntactic processing. The adopted morphological disambiguation algorithm depends on four layers: Uni-morphological form layer, rule-based morphological disambiguation layer, statistical-b...

Characteristics of Arabic Identity in Intellectual System of Hisham Kalbi based on his Books on Genealogy

Science of "Genealogy" was one of the branches of History and Historiography during the age of Jāhilīyah (age of ignorance) which has grown rapidly in the Islamic era. In this context, Hisham Kalbi (d. 204 AH. / 819 AD.), as the first author and editor of Genealogy, has a great contribution to the formation and prosperity of this science, with two important texts, the Jamharat Al-Ansab and Nasa...

Iranian EFL Learners L2 Reading Comprehension: The Effect of Online Annotations via Interactive White Boards

This study explores the effect of online annotations via Interactive White Boards (IWBs) on the reading comprehension of Iranian EFL learners. To this end, 60 students from a language institute were selected as homogeneous based on their performance on the Oxford Placement Test (2014). Then, they were randomly assigned to 3 experimental groups of 20, and subsequently exposed to the research treatment af...

Hybrid approaches for automatic vowelization of Arabic texts

Hybrid approaches for automatic vowelization of Arabic texts are presented in this article. The process is made up of two modules. In the first one, a morphological analysis of the text words is performed using the open-source morphological analyzer AlKhalil Morpho Sys. The output for each word, analyzed out of context, is its set of possible vowelizations. The integration of this Analyzer in o...

Publication date: 2013